The External Validation of GLOBE and UK-PBC Risk Scores for Predicting Ursodeoxycholic Acid Treatment Response in a Large U.S. Cohort of Primary Biliary Cholangitis Patients

Background: The cornerstone treatment for primary biliary cholangitis (PBC) is ursodeoxycholic acid (UDCA), but many patients exhibit an incomplete response, leading to disease progression. Risk prediction models like the GLOBE and UK-PBC scores hold promise for patient stratification and management. We aimed to independently assess the predictive accuracy of these risk scores for UDCA response in a prospective U.S. cohort. Methods: We conducted a prospective cohort study at a U.S. liver center, monitoring UDCA-treated PBC patients over a one-year follow-up. We evaluated the predictive efficacy of the GLOBE and UK-PBC scores for UDCA treatment response, comparing them to the Paris II criteria. Efficacy was assessed using univariate and multivariate analyses, followed by prognostic performance evaluation via receiver operating characteristic (ROC) curve analysis. Results: We evaluated 136 PBC patients undergoing UDCA therapy. Based on the Paris II criteria, patients were categorized into UDCA full-response and non-response groups. The GLOBE score identified a non-responder rate of 18% (p = 0.205), compared to 20% (p = 0.014) with the Paris II criteria. Multivariate analysis, adjusted for age and biochemical markers, showed that both the GLOBE and UK-PBC scores were strongly associated with treatment response (p < 0.001). The area under the ROC curve was 0.87 (95% CI 0.83−0.95) for the GLOBE score and 0.94 (95% CI 0.86−0.99) for the UK-PBC risk score. Conclusions: Our study demonstrates that GLOBE and UK-PBC scores effectively predict UDCA treatment response in PBC patients. The early identification of patients at risk of an incomplete response could improve treatment strategies and identify patients who may need second-line therapies.


Introduction
Primary biliary cholangitis (PBC) is a chronic, progressive cholestatic liver disease characterized by the autoimmune destruction of intrahepatic bile ducts. The estimated global incidence and prevalence rates of PBC are 1.76 and 14.60 per 100,000 persons, respectively [1]. In the United States (U.S.), the annual incidence of PBC is higher, at 2.75 per 100,000 persons, compared to Europe at 1.86, with the lowest incidence observed in the Asia-Pacific region at 0.84 [2]. PBC predominantly affects females (>90%) and is typically diagnosed in the fourth or fifth decade of life [3]. PBC is a progressive disease; without treatment, it advances to end-stage liver disease. A prospective study by Christensen et al. revealed that untreated PBC patients exhibited histologic progression within two years [4]. The risk of developing decompensated cirrhosis has been estimated at 15% to 25% over five years [5].
Treatment with ursodeoxycholic acid (UDCA) improves liver biochemistry, slows hepatic fibrosis progression, and may extend life expectancy [6]. UDCA has transformed the treatment landscape for PBC, with transplant-free survival rates of 79.7% among treated patients compared to 60.7% among untreated patients [7]. However, a substantial proportion of patients exhibit an inadequate response to UDCA, leading to a higher risk of liver-related disease progression [8]. Several criteria for evaluating UDCA treatment response have been developed, including the Rotterdam, Barcelona, Rochester-II, Paris, Toronto, and Ehime criteria. These prognostic models assess therapeutic effects based on liver biochemical parameters after 6, 12, or 24 months of UDCA treatment [9]. Despite their utility, these criteria have limitations, particularly for patients with inadequate responses, because such patients may continue ineffective treatment for extended periods before being identified, increasing the risk of disease progression [10].
Two independent research groups, the Global PBC Study Group and the United Kingdom (UK)-PBC Consortium, have developed and externally validated continuous prognostic models: the GLOBE and UK-PBC risk scores, respectively [11,12]. The GLOBE score was designed to estimate the risk of liver transplantation (LT) or overall death in patients with PBC who have been treated with UDCA for one year [10]. A previous retrospective study by Harms et al. estimated the score for both patients treated with UDCA and those who were not, evaluating LT-free survival [7]. Additionally, changes in the GLOBE score from baseline to one year after the initiation of UDCA and from one year to two years after the initiation of UDCA have been associated with LT or death [13]. Similarly, the UK-PBC risk score was developed to predict the risk of developing end-stage liver disease in patients treated with UDCA [12] and has been validated in North American cohorts [14]. Although both scores were originally validated in a Western population, a study from China demonstrated that these scores also provide reliable estimates in diverse ethnic populations [15].
Previously, single-center external validations of the GLOBE and UK-PBC scores within the U.S. included a validation study of the UK-PBC score at the Mayo Clinic, which involved a 20-year retrospective cohort of 464 patients with PBC [14]. Additionally, a retrospective study at the Cleveland Clinic evaluated both the GLOBE and UK-PBC scores in 352 PBC and PBC/overlap patients treated with UDCA between 1998 and 2017 [16]. Our study represents the largest prospective cohort study conducted at a tertiary center in the U.S., focusing exclusively on patients with PBC-only diagnoses.
Risk prediction models such as the GLOBE and UK-PBC scores play a crucial role in informing decision-making and guiding future patient management. These models must demonstrate transferability so that they can be confidently applied across diverse patient populations with PBC. Therefore, robust validation in external patient cohorts is essential before integrating these models into clinical practice. In this study, we aimed to independently evaluate the performance of the GLOBE and UK-PBC risk scores in predicting response to UDCA treatment in a prospective cohort in the United States.

Study Population
The subjects for this study were enrolled in a prospective autoimmune liver registry at Beth Israel Deaconess Medical Center (BIDMC; Boston, MA, USA). Enrollment occurred between January 2018 and November 2023. Subjects were eligible if they met the PBC diagnosis criteria based on internationally accepted standards, as recommended by the European Association for the Study of the Liver (EASL). Diagnosis of PBC was confirmed by the presence of at least two of the following criteria: (a) elevated alkaline phosphatase (ALP) levels; (b.1) the presence of antimitochondrial antibody (AMA) at a titer >1:40; or (b.2) the presence of anti-sp100/anti-glycoprotein 210 (anti-gp210); or (c) in cases of AMA-negative subjects, a liver biopsy showing classic histologic findings of PBC.
The exclusion criteria were as follows: age under 18 years at study entry, the presence of any autoimmune overlap syndrome, history of concomitant liver disease, missing data that prevented the assessment of treatment response, missing predictors for any of the risk scores, the absence of UDCA treatment or an unknown date of initial UDCA treatment, and the discontinuation of UDCA treatment within the first year (Figure 1). Ultimately, we analyzed 136 adult PBC patients who met the specified cohort characteristics.

Study Outcome, Variables, and Definitions
The study's outcome was the response to UDCA therapy, defined by the Paris II criteria as ALP levels ≤ 1.5 times the upper limit of normal (ULN), AST levels ≤ 1.5 times the ULN, or bilirubin levels < 1 mg/dL after one year of treatment. In this study, patients received the recommended UDCA dose (13 to 15 mg/kg/day).
We evaluated the predictive efficacy of the GLOBE score for UDCA treatment response, comparing it to the Paris II criteria. Patients with a GLOBE score above 0.30 were classified as non-responders, whereas those with a GLOBE score of 0.30 or less were classified as responders [11]. The GLOBE score was calculated using the following equation:

\[
\begin{aligned}
\text{GLOBE score} ={}& 0.044378 \times \text{age at start of UDCA therapy} \\
&+ 0.93982 \times \ln(\text{TB as a multiple of the ULN at 1-year follow-up}) \\
&+ 0.335648 \times \ln(\text{ALP as a multiple of the ULN at 1-year follow-up}) \\
&- 2.266708 \times \text{ALB as a multiple of the LLN at 1-year follow-up} \\
&- 0.002581 \times \text{PL count per } 10^{9}/\text{L at 1-year follow-up} \\
&+ 1.216865
\end{aligned}
\tag{2}
\]

Here, TB is total bilirubin, ALB is albumin, PL is the platelet count, and LLN is the lower limit of normal.
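As an illustration only, the GLOBE equation and the 0.30 cut-off described above can be sketched in Python. The function and parameter names are hypothetical (they do not come from the study), and the input values in the usage example are invented for demonstration, not patient data. The sketch assumes that bilirubin and ALP are already expressed as multiples of the ULN and albumin as a multiple of the LLN.

```python
import math


def globe_score(age_at_udca_start, bilirubin_x_uln, alp_x_uln,
                albumin_x_lln, platelets_1e9_per_l):
    """Compute the GLOBE score from 1-year follow-up values.

    bilirubin_x_uln and alp_x_uln are multiples of the upper limit of
    normal; albumin_x_lln is a multiple of the lower limit of normal;
    platelets_1e9_per_l is the platelet count per 10^9/L.
    (Parameter names are illustrative assumptions.)
    """
    if bilirubin_x_uln <= 0 or alp_x_uln <= 0:
        # The natural logarithm is undefined for non-positive values.
        raise ValueError("bilirubin and ALP ratios must be positive")
    return (
        0.044378 * age_at_udca_start
        + 0.93982 * math.log(bilirubin_x_uln)
        + 0.335648 * math.log(alp_x_uln)
        - 2.266708 * albumin_x_lln
        - 0.002581 * platelets_1e9_per_l
        + 1.216865
    )


def is_globe_non_responder(score, threshold=0.30):
    """Scores above the threshold are classified as non-responders."""
    return score > threshold


if __name__ == "__main__":
    # Hypothetical patient: age 50, bilirubin and ALP at exactly 1x ULN,
    # albumin at exactly 1x LLN, platelets 200 x 10^9/L.
    score = globe_score(50, 1.0, 1.0, 1.0, 200)
    print(round(score, 3))                # 0.653
    print(is_globe_non_responder(score))  # True
```

Because the score is continuous, the cut-off is kept as a parameter rather than hard-coded, so that alternative thresholds can be explored without changing the score calculation.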

Statistical Analysis
Demographic, clinical, and biochemical data were collected. Continuous variables with a normal distribution are presented as mean and standard deviation (SD), and non-normally distributed variables as median and interquartile range (IQR). Comparisons were made using a t-test or Mann-Whitney U test, as appropriate. Categorical variables, summarized as percentages, were compared using Pearson's chi-squared test (χ²).
To assess predictors of treatment effectiveness, we used logistic regression models. The preliminary univariate model included potential covariates at index (age; laboratory results for ALP, ALT, AST, TB, ALB, and PL; and the GLOBE and UK-PBC scores). Variables with p values < 0.05 were retained in the multivariate model, and results are presented as odds ratios (ORs) with 95% confidence intervals (CIs), with statistical significance defined as p < 0.05.
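For readers less familiar with logistic regression output, the relationship between a fitted coefficient and the reported OR can be written out as follows. This is a sketch assuming Wald-type intervals, which the text does not explicitly state; the symbols \(\hat\beta\) (estimated log-odds coefficient) and \(\mathrm{SE}(\hat\beta)\) (its standard error) are generic notation, not taken from the original.

```latex
\mathrm{OR} = e^{\hat\beta},
\qquad
95\%\ \mathrm{CI} = \left( e^{\hat\beta - 1.96\,\mathrm{SE}(\hat\beta)},\;
                           e^{\hat\beta + 1.96\,\mathrm{SE}(\hat\beta)} \right)
```

An OR above 1 indicates higher odds of the modeled outcome per unit increase in the covariate, and a 95% CI that excludes 1 corresponds to p < 0.05 under the same Wald assumption.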
The predictive performance was assessed by calculating and plotting the area under the receiver operating characteristic curve (AUROC) and estimating the 95% CI for each risk score.
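To make the AUROC concrete, the sketch below computes it as the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, counting ties as one half. This is equivalent to the area under the empirical ROC curve. The function name, the label coding (1 = outcome present, 0 = absent), and the example values are illustrative assumptions; the study's own analysis was run in Stata, and this sketch does not reproduce its confidence-interval estimation.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores, where a higher score is assumed to indicate
            higher risk of the outcome.
    labels: 1 if the outcome is present, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # correctly ranked pair
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented scores for four hypothetical patients.
    example_scores = [0.1, 0.4, 0.35, 0.8]
    example_labels = [0, 0, 1, 1]
    print(auroc(example_scores, example_labels))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of this size; rank-based implementations give the same value more efficiently for larger datasets.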
All statistical analyses were conducted using Stata version 18.0 (StataCorp LP, College Station, TX, USA).

Baseline Characteristics
Our study included 136 patients diagnosed with PBC who had been on continuous UDCA therapy for a year. The baseline characteristics of the cohort are detailed in Table 1. The majority of patients were female (90%) and predominantly white Caucasian (79.0%), with a mean age of 56 years. For PBC diagnosis, 82% of patients met ALP criteria, 63% had positive antibodies, and 51% had compatible liver biopsies. Additionally, 12.5% of patients already had cirrhosis at their initial visit.

Discussion
Our findings validate the utility of both the GLOBE and UK-PBC risk scores in accurately predicting treatment response in PBC. The GLOBE score, proposed by Lammers et al. [11], was developed using a derivation cohort of 2488 cases and a validation cohort of 1634 patients. This score combines predictive information on disease severity and treatment response. Initially designed to estimate the risk of death or LT after one year of UDCA therapy, recent studies indicate that the GLOBE score can also stratify UDCA-treated patients beyond that period [17].
Approximately 40% of PBC patients exhibit an incomplete response to UDCA, resulting in a worse prognosis compared to responders [3]. Moreover, other studies have reported that 20-30% of PBC patients exhibit incomplete biochemical responses to UDCA, highlighting the benefits of individualized treatment plans and personalized management strategies [18-20]. An international cohort study validated the GLOBE score using data from the Global PBC Study Group, which included patients from eight countries in Europe and North America [7]. Among 3433 patients treated with UDCA, 733 (21.4%) were classified as inadequate responders one year after initiating UDCA therapy. Similarly, our results showed that 23 patients (18%) were classified as non-responders using the GLOBE score. In comparison, the Paris II criteria identified 26 patients (20%) with an inadequate response to treatment. These results were expected, as the Paris II criteria consider three biochemical parameters, while the GLOBE score includes a broader range of parameters [21].
The lower percentage of non-responders to UDCA treatment in these study cohorts can be attributed to several factors supported by the existing literature. Firstly, our study cohort included a significant proportion of subjects with early PBC, as it was drawn from a prospective registry with a large proportion of patients enrolled at diagnosis. Hirschfield et al. [22] suggest that the early initiation of UDCA treatment, particularly within the first two years of diagnosis, is associated with improved response rates and slower disease progression. Furthermore, we only included patients who met the inclusion criteria, which were more stringent than those of other validation cohorts.
Despite their importance as prognostic markers, hepatic transaminases are not included in the GLOBE score. Our study found that AST and ALT were not significantly associated with an incomplete response to UDCA treatment, as confirmed by multivariate analysis. Previous research supports these findings. Mane et al.'s retrospective study of 53 PBC patients treated with UDCA for one year showed a significant reduction in ALP but no significant decrease in paired AST and bilirubin levels [23]. Similarly, Cortez-Pinto et al. found that ALT levels and increased bilirubin were not associated with an incomplete response [21]. Additionally, our findings demonstrated a significant association between ALP levels and treatment response, which is supported by Lammers et al.'s meta-analysis. This analysis demonstrated a log-linear relationship between ALP levels and LT-free survival, indicating that lower alkaline phosphatase values correlate with longer LT-free survival [24].
Our study found that age was not significantly associated with an incomplete response to treatment, consistent with findings from a previous Portuguese observational cohort study of 434 PBC patients assessing UDCA treatment response [21]. Previous studies have indicated that younger patients at diagnosis have a higher risk of treatment failure, likely due to presenting with a more severe form of the disease, potentially associated with a ductopenic phenotype resistant to UDCA therapy [25]. In contrast, another study identified older age at diagnosis as an independent predictor of mortality in PBC patients [26].
The one-year follow-up design and the inclusion criteria of this study limit the assessment of long-term hepatic decompensation events, which were not evident in the definitive cohort. A retrospective study by Gazda et al., including 249 Slovakian patients, primarily aimed to evaluate the risk of hepatic decompensation after UDCA therapy by assessing prognostic factors in PBC over a ten-year span. The study demonstrated that treatment failure after six months of UDCA therapy is linked to a 12-fold increase in the risk of liver decompensation, including ascites, hepatic encephalopathy, or variceal bleeding. Additionally, treatment failure after twelve months of UDCA therapy is associated with a 22-fold increase in the risk of liver decompensation [27].
The GLOBE score is an essential tool for managing PBC patients, providing an effective assessment of both treatment response and the risk of adverse outcomes [11,28]. Data from the Global PBC cohort indicate that changes in the GLOBE score during the first and second years predict death/LT-free survival, with hazard ratios (HRs) of 2.28 (p < 0.001) and 2.19 (p < 0.001), respectively, independent of the baseline score [29]. These findings suggest that monitoring the GLOBE score at multiple time points can enhance the accuracy of outcome prediction [28]. Additionally, a recent study by Montano et al., including 332 patients with recurrent PBC after LT from 28 centers across Europe, North America, and South America, demonstrated that both the GLOBE score and the UK-PBC score can identify patients at higher risk of graft loss and mortality post-LT [30].
The UK-PBC score, developed by Carbone et al., is based on a large cohort of 1916 British patients and has been further validated in an independent cohort of 1249 patients [12]. This risk score allows for the accurate long-term prediction of LT and liver-related death over 5, 10, and 15 years, with an AUROC exceeding 0.90. Our study examined the UK-PBC score as a prognostic tool for treatment response and revealed temporal variability in overall risk before and after UDCA treatment. This improvement in prognostic scores following UDCA treatment aligns with previous studies correlating biochemical response with survival outcomes. A study of 192 PBC patients treated with UDCA demonstrated that a good biochemical response after one year is associated with survival rates similar to those of a matched control population, highlighting the beneficial effects of UDCA in PBC [31]. Additionally, a systematic review and meta-analysis by Gazda et al. showed that the HR of the continuous UK-PBC risk score was 3.39 for liver events (95% CI: 3.10-3.72), while the HR of the binary form was 2.76 (95% CI: 2.14-3.69) [32]. These findings support the utility of the UK-PBC score in predicting long-term outcomes and the effectiveness of UDCA treatment in PBC.
The original studies highlighted the excellent predictive abilities of the GLOBE and UK-PBC risk scores [11,12]. Our study cohort also demonstrated comparable and highly accurate results across overall risk scoring systems, with an AUROC of 0.87 for the GLOBE score and 0.94 for the UK-PBC score. The high discriminative performance of the UK-PBC risk score suggests the effective categorization of individuals based on their likelihood of responding to treatment [33]. Several factors contribute to the excellent performance of these models. First, the GLOBE and UK-PBC scores incorporate multiple key independent variables such as age, bilirubin, ALP, albumin, and platelet count, unlike other criteria that solely focus on treatment response and may not account for cirrhosis. Additionally, most other models rely on laboratory data collected one year after starting UDCA treatment, while the GLOBE and UK-PBC scores use laboratory values from two time points (baseline and one year after starting UDCA), enhancing their precision [33,34]. Finally, both scores are continuous, whereas the dichotomization of continuous variables in previous models can reduce their robustness [33].
The prospective design of this study has inherent limitations. The primary constraint is the size of our cohort. Conducting large single-center prospective studies on PBC is challenging due to the disease's low prevalence and slow progression. Additionally, we excluded patients with incomplete information, which may have introduced selection bias. Furthermore, other clinical events that arise during PBC may interfere with the performance of risk scores. Finally, while our results confirm the predictive ability of the UK-PBC and GLOBE risk scores for treatment response, they do not address the calibration of these scores, as our outcome definition differs from that of the original studies.
While we acknowledge the study's limitations, these challenges underscore the importance of extended observation periods for a more comprehensive understanding of PBC patients. Future research should prioritize validating prognostic factors through a combination of retrospective and prospective studies, evaluating their contribution to newly developed prediction models.

Conclusions
Our study demonstrated the good prognostic performance of the GLOBE and UK-PBC scores in predicting UDCA treatment response for PBC patients. The early identification of patients at risk of an incomplete response could enhance treatment strategies. Consequently, these risk scores are crucial for selecting patients who may require additional second-line therapies.

Table 2. Characteristics of responders and non-responders after 12 months of UDCA treatment (n = 136).

Table 3. Changes in UK-PBC score after 12-month UDCA treatment.

Table 4. Univariate and multivariable logistic regression analysis for responders to UDCA treatment.